Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove openblas set_num_threads in julia __init__ #42442

Merged
merged 2 commits into from
Oct 2, 2021
Merged

Conversation

ViralBShah
Copy link
Member

@ViralBShah ViralBShah commented Oct 1, 2021

We used to set the number of threads to a conservative number, since openblas thread detection used to be less reliable than it is today. Hopefully this leads to better out of the box performance, since it is quite common for people to have more than 8 cores!

cc @chriselrod @staticfloat

@ViralBShah ViralBShah added linear algebra Linear algebra backport 1.6 Change should be backported to release-1.6 backport 1.7 labels Oct 1, 2021
@ViralBShah
Copy link
Member Author

This needs to be followed up with increasing of the max num threads as discussed in this PR, so that users can start getting good linear algebra performance out of the box.

JuliaPackaging/Yggdrasil#3667

@ViralBShah
Copy link
Member Author

ViralBShah commented Oct 1, 2021

Getting 2x peakflops() already with just this PR on a 20 core box.

@DilumAluthge
Copy link
Member

The Buildkite tester_linux64_st job ("st" = "single-threaded") keeps failing, even when I retry it.

Are the failures related to the contents of this PR?

@JeffBezanson
Copy link
Member

👍 Do we want to do anything if the number of threads is specified, e.g. -t 4, or just leave openblas separate from our thread selection?

@oscardssmith
Copy link
Member

I think we should leave it separate unless and until we can integrate OpenBlas threads with Julia threads.

@ViralBShah
Copy link
Member Author

What's the issue with the single thread failure?

@DilumAluthge
Copy link
Member

I don't know, but I've retried it three times now, and it fails every time.

The same job (tester_linux64_st) is passing on master, so I think the failures might be due to this PR.

@DilumAluthge
Copy link
Member

It looks like the failure is happening in the Distributed test set. @vchuravy could you take a look?

@DilumAluthge
Copy link
Member

DilumAluthge commented Oct 1, 2021

Distributed                         (1) |        started at 2021-10-01T16:22:31.825
Test Failed at /cache/build/amdci7-3/julialang/julia-master/julia-710a3ca24d/share/julia/stdlib/v1.8/Distributed/test/distributed_exec.jl:1034
  Expression: master_blas_thread_count <= 8
   Evaluated: 32 <= 8
ERROR: LoadError: There was an error during testing
in expression starting at /cache/build/amdci7-3/julialang/julia-master/julia-710a3ca24d/share/julia/stdlib/v1.8/Distributed/test/distributed_exec.jl:1068
Distributed                         (1) |         failed at 2021-10-01T16:23:27.146

@vchuravy
Copy link
Member

vchuravy commented Oct 1, 2021

Yeah @ViralBShah that test in Distributed.jl needs to be adjusted, since it is explicitly testing that invariant.

@DilumAluthge
Copy link
Member

What would be the most correct way to fix the test?

@ViralBShah
Copy link
Member Author

Note impact on startup times discussed in JuliaPackaging/Yggdrasil#3667 (comment)

This will remove the cap of 8 threads, and thus add something like 50ms more when using 32 threads (if the hardware does have that many cores).

@DilumAluthge
Copy link
Member

For Buildkite CI, we probably still want to limit the OpenBLAS thread count to avoid each Buildkite job from starting e.g. 64 OpenBLAS threads.

#42470

@ViralBShah ViralBShah merged commit 65eff6d into master Oct 2, 2021
@ViralBShah ViralBShah deleted the vs/openblas-init branch October 2, 2021 11:48
@KristofferC
Copy link
Member

KristofferC commented Oct 2, 2021

Personally, I think adding a 25% increased startup time for everyone with high core system is a bit much. It feels more reasonable to opt into that if you know you think you think you are going to use it. A lot of work has gone in to push that startup time down.

@ViralBShah
Copy link
Member Author

ViralBShah commented Oct 2, 2021

I should also point out that this 50ms startup time is not purely a startup time hit. It includes doing using LinearAlgebra, querying and showing the number of threads.

If I do a pure startup time test, there is no perceptible difference. I get 0.16s with 8 threads and 0.17s with 32 threads on a fairly old machine. In many runs I get the same time for 8 threads and 32 threads, which suggests a difference on the order of 5ms. I feel this is reasonable - that there is a little bit more of an impact when you enter the BLAS, but pure startup remains unaffected.

@vchuravy vchuravy removed the backport 1.6 Change should be backported to release-1.6 label Oct 2, 2021
@vchuravy
Copy link
Member

vchuravy commented Oct 2, 2021

I don't think this should necessarily be backported to 1.6. Changing the default behaviour like that seems okay for a minor release, but not in a patch release.

@KristofferC
Copy link
Member

I don't think it should go in 1.7 either. At least not in 1.7. 0. The impact on loading times is unclear and since this has not been tested on master at all it introduces some uncertainty and we don't want that for a heavily delayed release.

@stevengj
Copy link
Member

Could OpenBLAS be set up to initialize lazily, i.e. it waits to spawn threads until you call an OpenBLAS function for the first time? That way you wouldn't pay this startup cost for code that is not using linear algebra.

LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Feb 22, 2022
* Remove openblas_set_num_threads in julia __init__

* Remove test no longer needed.
LilithHafner pushed a commit to LilithHafner/julia that referenced this pull request Mar 8, 2022
* Remove openblas_set_num_threads in julia __init__

* Remove test no longer needed.
Keno pushed a commit that referenced this pull request Jun 5, 2024
* Remove openblas_set_num_threads in julia __init__

* Remove test no longer needed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
linear algebra Linear algebra
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants